Two Dimensional Principal Component Analysis for Online Tamil Character Recognition
Abstract
This paper presents a new application of two-dimensional Principal Component Analysis (2DPCA) to the problem of online character recognition in Tamil script. A novel set of features employing polynomial fits and quartiles, in combination with conventional features, is derived for each sample point of the Tamil character obtained after smoothing and resampling. These features are stacked to form a matrix, from which a covariance matrix is constructed. A subset of the eigenvectors of the covariance matrix is used to obtain the features in the reduced subspace. Each character is modeled as a separate subspace, and a modified form of the Mahalanobis distance is derived to classify a given test character. Results indicate that the 2DPCA scheme improves recognition accuracy by approximately 3% over the conventional PCA technique.
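The projection step described in the abstract can be illustrated with a minimal sketch of the standard 2DPCA formulation. It is not the paper's implementation: the matrix layout (rows are resampled points, columns are per-point features), the sizes, the random data, and the function names `fit_2dpca` and `project_2dpca` are all assumptions made for illustration. The paper's feature set and its modified Mahalanobis distance are not reproduced here.

```python
import numpy as np


def fit_2dpca(samples, num_components):
    """Learn a 2DPCA projection from feature matrices.

    samples: array of shape (N, n_points, n_features), one matrix per
    character sample (hypothetical layout, assumed for this sketch).
    """
    samples = np.asarray(samples, dtype=float)
    mean_matrix = samples.mean(axis=0)
    centered = samples - mean_matrix
    # Covariance built directly from the matrices:
    # G = (1/N) * sum_i (A_i - mean)^T (A_i - mean), shape (n_features, n_features)
    covariance = np.einsum("npi,npj->ij", centered, centered) / len(samples)
    # eigh returns eigenvalues in ascending order for a symmetric matrix
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:num_components]
    return mean_matrix, eigenvectors[:, order]


def project_2dpca(sample, projection):
    """Map one (n_points, n_features) matrix to (n_points, num_components)."""
    return np.asarray(sample, dtype=float) @ projection


# Synthetic stand-in data: 50 samples, 60 resampled points, 8 features per point
rng = np.random.default_rng(0)
train = rng.normal(size=(50, 60, 8))
mean_matrix, projection = fit_2dpca(train, num_components=3)
reduced = project_2dpca(train[0], projection)
```

Unlike conventional PCA, the sample matrices are never flattened into long vectors, so the covariance matrix here is only `n_features` by `n_features`. Fitting one such projection per character class would correspond to the per-character subspaces the abstract mentions.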
Related resources
Multilingual OCR system for South Indian scripts and English documents: An approach based on Fourier transform and principal component analysis
Character recognition lies at the core of the discipline of pattern recognition where the aim is to represent a sequence of characters taken from an alphabet [Kasturi, R., Gorman, L.O., Govindaraju, V., 2002. Document image analysis: a primer. Sadhana 27 (Part 1), 3–22]. Though many kinds of features have been developed and their test performances on standard database have been reported, there ...
Resolving Ambiguities in Confused Online Tamil Characters with Post Processing Algorithms
This paper addresses the problem of resolving ambiguities in frequently confused online Tamil character pairs by employing script-specific algorithms as a post-classification step. Robust structural cues and temporal information of the preprocessed character are extensively utilized in the design of these algorithms. The methods are quite robust in automatically extracting the discriminative su...
An Investigation on the Performance of Hybrid Features for Feed Forward Neural Network Based English Handwritten Character Recognition System
Optical Character Recognition (OCR) is one of the active subjects of research in the field of pattern recognition. The two main stages in an OCR system are feature extraction and classification. In this paper, a new hybrid feature extraction technique and a neural network classifier are proposed for an off-line handwritten English character recognition system. The hybrid features are obtained by...
Supervised Feature Extraction of Face Images for Improvement of Recognition Accuracy
Dimensionality reduction methods transform or select a low-dimensional feature space to efficiently represent the original high-dimensional feature space of the data. Feature reduction is an important step in many pattern recognition problems in different fields, especially in the analysis of high-dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...
An Improved Handwritten Tamil Character Recognition System using Octal Graph
Problem Statement: Handwriting recognition has attracted voluminous research in recent times. The segmentation and recognition of characters from handwritten scripts incur considerable overhead. Almost all existing handwritten character recognition techniques use a neural network approach, which requires a lot of preprocessing and hence accomplishing these problems using neural netwo...